Abstract


The FastInf C++ library is designed to perform memory- and time-efficient approximate inference
in large-scale discrete undirected graphical models. The focus of the library is propagation-based
approximate inference methods, ranging from the basic loopy belief propagation algorithm to
propagation based on convex free energies. Various message scheduling schemes that improve on the
standard synchronous or asynchronous approaches are included. Also implemented are
clique-tree-based exact inference, Gibbs sampling, and the mean field algorithm. In addition to inference,
FastInf provides parameter estimation capabilities as well as representation and learning of shared
parameters. It offers a rich interface that facilitates extension of the basic classes to other inference
and learning methods.


Keywords: graphical models, Markov random field, loopy belief propagation, approximate inference


1. Introduction 


Probabilistic graphical models (Pearl, 1988) are a framework for representing a complex joint
distribution over a set of n random variables $\mathcal{X} = \{X_1, \ldots, X_n\}$. A qualitative graph encodes probabilistic
independencies between the variables and implies a decomposition of the joint distribution into a
product of local terms:


$$P(\mathcal{X}) = \frac{1}{Z} \prod_i \psi_i(C_i),$$


where $C_i$ are subsets of $\mathcal{X}$ defined by the cliques of the graph structure, $\psi_i(C_i)$ are the quantitative
parameters (potential functions) that define the distribution, and $Z$ is the normalization constant. Computing marginal probabilities and
likelihood in graphical models are critical tasks needed both for making predictions and to facilitate 
learning. Obtaining exact answers to these inference queries is often infeasible even for relatively 
modest problems. Thus, there is a growing need for inference methods that are both efficient and
can provide reasonable approximate computations. Despite few theoretical guarantees, the Loopy 
Belief Propagation (LBP, Pearl, 1988) algorithm has gained significant popularity in the last two 
decades due to impressive empirical success, and is now being used in a wide range of applications 
ranging from transmission decoding to image segmentation (Murphy and Weiss, 1999; McEliece 
et al., 1998; Shental et al., 2003). Recently there has been an explosion in practical and theoretical 
interest in propagation based inference methods, and a range of improvements to the convergence 
behavior and approximation quality of the basic algorithms have been suggested (Wainwright et al., 
2003; Wiegerinck and Heskes, 2003; Elidan et al., 2006; Meshi et al., 2009). 

We present the FastInf library for efficient approximate inference in large-scale discrete
probabilistic graphical models. While the library's focus is propagation-based inference techniques,
implementations of other popular inference algorithms such as mean field (Jordan et al., 1998) and
Gibbs sampling are also included. To facilitate inference for a wide range of models, FastInf's
representation is flexible, allowing the encoding of standard Markov random fields as well as
template-based probabilistic relational models (Friedman et al., 1999; Getoor et al., 2001) through the use
of shared parameters. In addition, FastInf supports learning by providing parameter
estimation based on the Maximum-Likelihood (ML) principle, with standard regularization.
Missing data is handled via the Expectation Maximization (EM) algorithm (Dempster et al., 1977).

FastInf has been used successfully in a number of challenging applications, ranging from
inference in protein-protein networks with tens of thousands of variables and small cycles (Jaimovich
et al., 2005), through protein design (Fromer and Yanover, 2008), to object localization in cluttered
images (Elidan et al., 2006).


2. Features 


The FastInf library was designed with a focus on generality and flexibility. Accordingly, a rich
interface enables implementation of a wide range of probabilistic graphical models to which all
inference and learning methods can be applied. A basic general-purpose propagation algorithm is
at the base of all propagation variants and allows straightforward extensions.

A model is defined via a graph interface that requires the specification of a set of cliques
$C_1, \ldots, C_k$ and a corresponding set of tables that quantify the parametrization $\psi_i(C_i)$ for each joint
assignment of the variables in the clique $C_i$. This general setting can be used to perform inference
both for the directed Bayesian network representation and the undirected Markov one.
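
To make the clique-and-table view concrete, the following is a minimal sketch of such a model in plain C++. It is not FastInf code: the `Clique` struct and the functions `tableIndex`, `unnormalized`, and `partitionFunction` are hypothetical names introduced here for illustration, and the table layout (first clique variable varies slowest) is an assumption. The sketch multiplies the clique tables at a full assignment and normalizes by brute-force enumeration, which is feasible only for tiny models and is exactly the cost that approximate inference avoids.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration; these are not FastInf classes.
struct Clique {
    std::vector<int> vars;      // indices of the variables in this clique
    std::vector<double> table;  // one potential value per joint assignment
};

// Flat table index of the clique's joint assignment.
// Assumed layout: the first variable of the clique varies slowest.
std::size_t tableIndex(const Clique& c, const std::vector<int>& card,
                       const std::vector<int>& assignment) {
    std::size_t idx = 0;
    for (int v : c.vars) {
        idx = idx * static_cast<std::size_t>(card[v]) +
              static_cast<std::size_t>(assignment[v]);
    }
    assert(idx < c.table.size());
    return idx;
}

// Product of all clique potentials at a full assignment (not normalized).
double unnormalized(const std::vector<Clique>& cliques,
                    const std::vector<int>& card,
                    const std::vector<int>& assignment) {
    double p = 1.0;
    for (const Clique& c : cliques) {
        p *= c.table[tableIndex(c, card, assignment)];
    }
    return p;
}

// Normalization constant by enumerating every joint assignment.
// Exponential in the number of variables; usable on tiny models only.
double partitionFunction(const std::vector<Clique>& cliques,
                         const std::vector<int>& card) {
    std::vector<int> a(card.size(), 0);
    double z = 0.0;
    while (true) {
        z += unnormalized(cliques, card, a);
        // Advance the assignment like an odometer; stop after the last one.
        std::size_t i = 0;
        while (i < a.size() && ++a[i] == card[i]) {
            a[i] = 0;
            ++i;
        }
        if (i == a.size()) break;
    }
    return z;
}
```

Dividing `unnormalized(...)` by `partitionFunction(...)` gives the probability of a full assignment. A directed Bayesian network fits the same structure by storing each conditional probability table as the table of a clique over a node and its parents, in which case the normalization constant equals one.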


2.1 Inference Methods 


FastInf includes implementations of the following inference methods: 


Exact inference by the Junction-Tree algorithm (Lauritzen and Spiegelhalter, 1988)
Loopy Belief Propagation (Pearl, 1988)
Generalized Belief Propagation (Yedidia et al., 2005)
Tree Re-weighted Belief Propagation (Wainwright et al., 2005)
Propagation based on convexification of the Bethe free energy (Meshi et al., 2009)
Mean field (Jordan et al., 1998)
Gibbs sampling (Geman and Geman, 1984)

By default, all methods are used with standard asynchronous message scheduling. We also
implemented two alternative scheduling approaches that can lead to better convergence properties
(Wainwright et al., 2002; Elidan et al., 2006). All methods can be applied to both sum-product and max-product
propagation schemes, with or without damping of messages.
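
The difference between sum-product and max-product propagation, and the effect of damping, can be illustrated with a single message update over a pairwise potential. The sketch below is not taken from FastInf: `updateMessage` and its parameters are hypothetical, and the damping rule shown (a convex combination of the previous and the newly computed normalized message) is one common convention, assumed here because the text does not specify the one the library uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not a FastInf function.
// Computes one message from variable i to variable j over a pairwise
// potential psi[x_i][x_j] with non-negative entries:
//   raw(x_j) = op over x_i of psi(x_i, x_j) * incoming(x_i),
// where op is a sum (sum-product) or a max (max-product).
// The raw message is normalized and then damped (assumed convention):
//   m(x_j) = damping * old(x_j) + (1 - damping) * raw(x_j) / sum(raw).
std::vector<double> updateMessage(const std::vector<std::vector<double>>& psi,
                                  const std::vector<double>& incoming,
                                  const std::vector<double>& oldMsg,
                                  bool useMax, double damping) {
    assert(!psi.empty() && psi.size() == incoming.size());
    const std::size_t nj = psi[0].size();
    assert(oldMsg.size() == nj);
    assert(damping >= 0.0 && damping < 1.0);

    std::vector<double> msg(nj, 0.0);
    for (std::size_t xi = 0; xi < psi.size(); ++xi) {
        assert(psi[xi].size() == nj);
        for (std::size_t xj = 0; xj < nj; ++xj) {
            const double term = psi[xi][xj] * incoming[xi];
            msg[xj] = useMax ? std::max(msg[xj], term) : msg[xj] + term;
        }
    }

    double total = 0.0;
    for (double v : msg) total += v;
    assert(total > 0.0);

    for (std::size_t xj = 0; xj < nj; ++xj) {
        msg[xj] = damping * oldMsg[xj] + (1.0 - damping) * msg[xj] / total;
    }
    return msg;
}
```

With `damping` set to zero the update reduces to the undamped rule. A scheduling scheme then only decides the order in which such updates are applied across the edges of the graph.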


2.2 Relational Representation 


In many domains, a specific local interaction pattern can recur many times. To represent such 
domains, it is useful to allow multiple cliques to share the same parametrization. In this case a set
of template table parametrizations $\psi_1, \ldots, \psi_T$ is used to parametrize all cliques using


$$P(\mathcal{X}) = \frac{1}{Z} \prod_t \prod_{i \in \mathcal{I}(t)} \psi_t(C_i),$$


where $\mathcal{I}(t)$ is the set of cliques that are mapped to the t'th potential and $Z$ is the normalization constant. This template-based
representation allows the definition of large-scale models using a relatively small number of parameters.
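
The saving from shared parameters can be sketched as follows. This is an illustrative data structure in plain C++, not FastInf's representation: `TemplateModel`, `unnormalized`, and `numParameters` are hypothetical names, and each clique simply stores the index of the template table it is mapped to.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration; these are not FastInf classes.
struct TemplateModel {
    std::vector<int> card;                       // cardinality of each variable
    std::vector<std::vector<double>> templates;  // shared tables, one per template t
    std::vector<std::vector<int>> cliqueVars;    // variables of each clique
    std::vector<std::size_t> cliqueTemplate;     // template index used by each clique
};

// Product over all cliques of the shared table entry selected by the
// clique's joint assignment (first clique variable varies slowest).
double unnormalized(const TemplateModel& m, const std::vector<int>& assignment) {
    assert(m.cliqueVars.size() == m.cliqueTemplate.size());
    double p = 1.0;
    for (std::size_t i = 0; i < m.cliqueVars.size(); ++i) {
        std::size_t idx = 0;
        for (int v : m.cliqueVars[i]) {
            idx = idx * static_cast<std::size_t>(m.card[v]) +
                  static_cast<std::size_t>(assignment[v]);
        }
        const std::vector<double>& table = m.templates[m.cliqueTemplate[i]];
        assert(idx < table.size());
        p *= table[idx];
    }
    return p;
}

// Number of free table entries: depends on the templates, not on the
// number of cliques that reuse them.
std::size_t numParameters(const TemplateModel& m) {
    std::size_t n = 0;
    for (const std::vector<double>& t : m.templates) n += t.size();
    return n;
}
```

For example, a chain of binary variables whose pairwise cliques all map to one template has four table entries regardless of the chain length.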


2.3 Parameter Estimation 


FastInf can also be used for learning the parameters of the model from evidence. This is done
by using gradient-based methods with the Maximum-Likelihood (ML) objective. The library also
handles partial evidence by applying the EM algorithm (Dempster et al., 1977). Moreover, FastInf
supports $L_1$ and $L_2$ regularization that is added as a penalty term to the ML objective.
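
To see why parameter estimation relies on inference, consider the standard gradient of a regularized ML objective for fully observed data. The notation below is introduced here as an assumption and is not taken from the library: $\theta_t(c) = \log \psi_t(c)$ is the log-potential of the t'th template at clique assignment $c$, $\mathcal{I}(t)$ is the set of cliques that share that template, $\hat{P}$ is the empirical distribution of the $M$ training instances $x[1], \ldots, x[M]$, and $\lambda$ is the weight of an $L_2$ penalty.

$$\frac{\partial}{\partial \theta_t(c)} \left[ \frac{1}{M} \sum_{m=1}^{M} \log P_\theta(x[m]) - \frac{\lambda}{2} \|\theta\|_2^2 \right] = \sum_{i \in \mathcal{I}(t)} \left( \hat{P}(C_i = c) - P_\theta(C_i = c) \right) - \lambda\, \theta_t(c)$$

The first term in the sum is a count over the data, while the second is a clique marginal under the current model, which is the quantity that the inference methods compute or approximate. Each gradient step therefore requires a run of inference.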


3. Documentation 


For detailed instructions on how to install and use the library, usage examples, and documentation
on the main classes of the library, visit the FastInf home page at: http://compbio.cs.huji.ac.il/Fastinf.